Efficient Calculation of the Gauss-Newton Approximation of the Hessian Matrix in Neural Networks

Authors

  • Michael Fairbank
  • Eduardo Alonso
Abstract

The Levenberg-Marquardt (LM) learning algorithm is a popular algorithm for training neural networks; however, for large neural networks, it becomes prohibitively expensive in terms of running time and memory requirements. The most time-critical step of the algorithm is the calculation of the Gauss-Newton matrix, which is formed by multiplying two large Jacobian matrices together. We propose a method that uses backpropagation to reduce the time of this matrix-matrix multiplication. This reduces the overall asymptotic running time of the LM algorithm by a factor of the order of the number of output nodes in the neural network.
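
As a concrete illustration of the step the abstract describes, the following JAX sketch stacks the per-pattern Jacobians of the network outputs with respect to the weights into J, forms the Gauss-Newton matrix G = J^T J by explicit matrix-matrix multiplication, and takes a damped LM step by solving (G + lambda*I) d = J^T e. The toy network, its sizes, and all variable names are illustrative assumptions rather than the paper's notation, and this shows only the baseline computation; the paper's contribution is a backpropagation-based shortcut for the J^T J product itself, which the sketch does not implement.

    import jax
    import jax.numpy as jnp

    def net(w, x):
        # Toy two-layer network with a flat weight vector w (sizes are
        # illustrative): 2 inputs -> 3 tanh hidden units -> 2 outputs.
        W1 = w[:6].reshape(3, 2)
        W2 = w[6:].reshape(2, 3)
        return W2 @ jnp.tanh(W1 @ x)

    def lm_step(w, xs, ts, lam):
        # Stack residuals and per-pattern Jacobians over all patterns;
        # J has one row per (pattern, output) pair, one column per weight.
        e = jnp.concatenate([net(w, x) - t for x, t in zip(xs, ts)])
        J = jnp.concatenate([jax.jacobian(net)(w, x) for x in xs], axis=0)
        G = J.T @ J  # the time-critical matrix-matrix product
        return w - jnp.linalg.solve(G + lam * jnp.eye(w.size), J.T @ e)

    w = jax.random.normal(jax.random.PRNGKey(0), (12,))
    xs = jax.random.normal(jax.random.PRNGKey(1), (5, 2))  # 5 training patterns
    ts = jnp.zeros((5, 2))                                 # dummy targets
    w = lm_step(w, xs, ts, lam=1e-2)

Forming J and multiplying J^T J this way costs on the order of (patterns x outputs x weights^2) operations, which is where the factor of the number of output nodes that the paper removes comes from.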

Similar papers

Using an Efficient Penalty Method for Solving Linear Least Square Problem with Nonlinear Constraints

In this paper, we use a penalty method for solving the linear least squares problem with nonlinear constraints. In each iteration of penalty methods for solving the problem, the calculation of the projected Hessian matrix is required. Given that the objective function is linear least squares, the projected Hessian matrix of the penalty function consists of two parts, of which the exact amount of a part of i...

Full text
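
The abstract above is cut off, but the two-part split it refers to can be illustrated generically. In a quadratic-penalty method, the constrained problem min ||Ax - b||^2 subject to c(x) = 0 is replaced by minimizing ||Ax - b||^2 + mu*c(x)^2 for an increasing sequence of penalty weights mu, and the Gauss-Newton Hessian of this objective splits into an exact linear least-squares part, A^T A, plus a penalty part built from the constraint gradient. The JAX sketch below is a generic illustration under these assumptions, with a made-up constraint and data, not the algorithm of the paper.

    import jax
    import jax.numpy as jnp

    A = jnp.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    b = jnp.array([1.0, 2.0, 3.0])

    def c(x):
        # Hypothetical nonlinear constraint: x must lie on the unit circle.
        return x[0] ** 2 + x[1] ** 2 - 1.0

    def penalty_gn_step(x, mu):
        # Gauss-Newton Hessian of ||Ax-b||^2 + mu*c(x)^2 (up to a factor
        # of 2): an exact least-squares part plus a constraint part.
        g = jax.grad(c)(x)
        H = A.T @ A + mu * jnp.outer(g, g)
        grad = A.T @ (A @ x - b) + mu * c(x) * g
        return x - jnp.linalg.solve(H, grad)

    x = jnp.array([0.5, 0.5])
    for mu in [1.0, 10.0, 100.0]:  # classic increasing penalty schedule
        for _ in range(20):
            x = penalty_gn_step(x, mu)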

Block-diagonal Hessian-free Optimization for Training Neural Networks

Second-order methods for neural network optimization have several advantages over methods based on first-order gradient descent, including better scaling to large mini-batch sizes and fewer updates needed for convergence. But they are rarely applied to deep learning in practice because of high computational cost and the need for model-dependent algorithmic variations. We introduce a variant of ...

Full text

Preconditioning for Hessian-Free Optimization

Recently, Martens adapted the Hessian-free optimization method to the training of deep neural networks. One key aspect of this approach is that the Hessian is never computed explicitly; instead, the Conjugate Gradient (CG) algorithm is used to compute the new search direction by applying only matrix-vector products of the Hessian with arbitrary vectors. This can be done efficiently using a varian...

Full text
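
The building block described here, a Hessian-vector product computed without ever forming the Hessian, can be sketched in JAX as forward-mode differentiation of the reverse-mode gradient (Pearlmutter's trick). The toy model, data, and damping constant below are illustrative assumptions; this shows the generic CG-based step of Hessian-free methods, not the preconditioning scheme of the paper above.

    import jax
    import jax.numpy as jnp
    from jax.scipy.sparse.linalg import cg

    def loss(w, X, t):
        # Toy regression loss (model and shapes are illustrative).
        return jnp.sum((jnp.tanh(X @ w) - t) ** 2)

    def hvp(w, v, X, t):
        # Hessian-vector product H @ v without materializing H:
        # forward-over-reverse differentiation of the gradient.
        grad_fn = lambda u: jax.grad(loss)(u, X, t)
        return jax.jvp(grad_fn, (w,), (v,))[1]

    w = jax.random.normal(jax.random.PRNGKey(0), (4,))
    X = jax.random.normal(jax.random.PRNGKey(1), (16, 4))
    t = jnp.sin(X[:, 0])

    g = jax.grad(loss)(w, X, t)
    # One Hessian-free step: solve (H + damping*I) d = -g with CG,
    # touching H only through matrix-vector products.
    d, _ = cg(lambda v: hvp(w, v, X, t) + 0.1 * v, -g)
    w = w + d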

A New Load-Flow Method in Distribution Networks based on an Approximation Voltage-Dependent Load model in Extensive Presence of Distributed Generation Sources

The power-flow (PF) solution is a basic and powerful tool in power system analysis. Distribution networks (DNs), compared to transmission systems, have many fundamental distinctions that render the conventional PF ineffective on these networks. This paper presents a new, fast, and efficient PF method which supports all the different models of Distributed Generations (DGs) and their operational modes...

Full text

Journal:
  • Neural computation

Volume 24, Issue 3

Pages: -

Publication date: 2012